We report a study of the invariant mass distribution of jet pairs produced in association with a
W boson using data collected with the CDF detector which correspond to an integrated luminosity
of 4.3 fb$^{-1}$. The observed distribution has an excess in the 120-160 GeV/$c^2$ mass range which is
not described by current theoretical predictions within the statistical and systematic uncertainties.
In this letter we report studies of the properties of this excess.

Measurements of associated production of a W boson and jets are fundamental probes of the
electroweak sector of the standard model (SM) and are an essential starting point for searches
for physics beyond the SM. Several important processes share this signature, such as diboson
production, associated production of a W and a light Higgs boson, and searches for new
phenomena [1, 2]. At the Fermilab Tevatron collider the D0 collaboration, using a data sample
corresponding to an integrated luminosity of 1.1 fb$^{-1}$, reported first evidence for the
production of either an additional W or a Z boson in association with a W boson (WW or WZ
diboson production) in a lepton plus jets final state [3]. The CDF collaboration recently
measured the cross section for the same channel, as described in Ref. [4]. One of the two
methods described in the CDF work uses the invariant mass of the two-jet system ($M_{jj}$) to
extract a WW + WZ signal from data. Here we perform a statistical comparison of that spectrum
with expectations by including additional data and further studying the $M_{jj}$ distribution
for masses higher than 100 GeV/$c^2$, with minimal changes to the event selection with respect
to the previous analysis. We find a statistically significant disagreement with current
theoretical predictions.

The parts of the CDF II detector [5] relevant to this analysis are briefly described here. The
tracking system is composed of silicon microstrip detectors and an open-cell drift chamber
inside a 1.4 T solenoid. Electromagnetic lead-scintillator and hadronic iron-scintillator
sampling calorimeters segmented in a projective tower geometry surround the tracking detectors.
A central calorimeter covers a pseudorapidity range $|\eta| < 1.1$, while "plug" calorimeters
extend the acceptance into the region $1.1 < |\eta| < 3.6$ [6]. Outside the calorimeters are
muon detectors composed of scintillators and drift chambers. Cherenkov counters around the beam
pipe provide the collider luminosity measurement [7].

The trigger selection used to collect the data sample required a central, high-$p_T$ electron
(muon). Further event selection requirements are applied offline to reject backgrounds and
reduce the sensitivity to systematic uncertainties. We require the presence of one electron
(muon) candidate with $E_T$ ($p_T$) > 20 GeV (GeV/$c$) and $|\eta| < 1.0$ plus missing
transverse energy $\not\!\!E_T > 25$ GeV. Both electrons and muons are required to be isolated
(Iso < 0.1) [8] to reject leptons from semileptonic decays of heavy flavor hadrons and hadrons
misidentified as leptons. Jets are clustered using a fixed-cone algorithm with radius
$\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.4$, and their energies are corrected for
detector effects that are of the order of 25% for jet $E_T = 30$ GeV [9]. Jets with an electron
or muon in a cone $\Delta R = 0.52$ around the jet axis are removed. Cosmic rays and
photon-conversion candidates are removed. We require events to have exactly two jets each with
$E_T > 30$ GeV and $|\eta| < 2.4$, and the dijet system to have $p_T > 40$ GeV/$c$.
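
The jet requirements above can be sketched in a few lines. This is only an illustration under stated assumptions: the dictionary layout of jets and leptons (`et`, `eta`, `phi`) is hypothetical, jets are treated as massless so that $E_T$ stands in for $p_T$, and "exactly two jets" is read as exactly two jets passing the $E_T$ and $|\eta|$ thresholds after lepton-overlap removal.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi

def delta_r(eta1, phi1, eta2, phi2):
    """Cone distance: sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def dijet_pt(jet1, jet2):
    """Transverse momentum of the two-jet system (massless-jet approximation)."""
    px = jet1["et"] * math.cos(jet1["phi"]) + jet2["et"] * math.cos(jet2["phi"])
    py = jet1["et"] * math.sin(jet1["phi"]) + jet2["et"] * math.sin(jet2["phi"])
    return math.hypot(px, py)

def passes_jet_requirements(jets, lepton):
    """Apply the overlap removal, jet thresholds and dijet pT cut quoted in the text."""
    # Remove jets that have the lepton within a cone of 0.52 around the jet axis.
    kept = [j for j in jets
            if delta_r(j["eta"], j["phi"], lepton["eta"], lepton["phi"]) >= 0.52]
    # Assumed reading: exactly two jets with E_T > 30 GeV and |eta| < 2.4.
    good = [j for j in kept if j["et"] > 30.0 and abs(j["eta"]) < 2.4]
    if len(good) != 2:
        return False
    return dijet_pt(good[0], good[1]) > 40.0
```

The wrap in `delta_phi` matters for jets on either side of $\phi = \pm\pi$, where a naive difference would overestimate the separation.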

The transverse mass $M_T(W)$ [6] of the lepton + $\not\!\!E_T$ system must be greater than
30 GeV/$c^2$; the two jets must be separated by $|\Delta\eta| < 2.5$. To suppress multijet
background, we further require that the direction of $\not\!\!E_T$ and of the most energetic
jet are separated azimuthally by $|\Delta\phi| > 0.4$.
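
For orientation, the transverse mass of the lepton plus missing-energy system is conventionally written as below. The text defers the definition to Ref. [6]; the expression here is the standard one and is assumed rather than quoted, with $p_T^{\ell}$ standing for the lepton transverse momentum ($E_T$ for electrons) and $\Delta\phi$ the azimuthal angle between the lepton and the missing transverse energy.

```latex
M_T(W) = \sqrt{2\, p_T^{\ell}\, \not\!\!E_T \left(1 - \cos\Delta\phi\right)}
```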

To remove contamination from Z production, we reject events where an additional lepton is found
using looser criteria and the invariant mass of the two leptons is in the range
76-106 GeV/$c^2$. We further reject events with two identified leptons, where the $E_T$ ($p_T$)
threshold for the second lepton is decreased to 10 GeV (GeV/$c$), to suppress other sources of
real dileptons such as leptonic decays of both final state W's in $t\bar{t}$ and dibosons with
jets. The main difference with respect to the selection criteria used in Ref. [4] is that the
jet $E_T$ threshold is increased from 20 GeV to 30 GeV, motivated by the interest in a higher
invariant mass range. This analysis critically depends on the shape of the steeply falling
dijet mass distribution. For this reason, we verified by Monte Carlo studies that our selection
does not sculpt the dijet invariant mass distribution of any process expected to contribute to
the sample at masses above 100 GeV/$c^2$. The resulting sample is dominated by events where a
W boson, which decays leptonically, is produced in association with jets (W+jets). Minor
contributions to the selected sample come from WW+WZ, $t\bar{t}$, Z+jets, single top production
and multijet QCD sources. Predictions for these processes, with the exception of the multijet
QCD component, are obtained using event generators and a GEANT-based CDF II detector
simulation [10]. The diboson, $t\bar{t}$, and single top components are simulated using the
Pythia event generator [11]. The W+jets and Z+jets processes are simulated using the
leading-order matrix element event generator Alpgen [12] with an interface to Pythia providing
parton showering and hadronization [13]. Multijet QCD events, where one jet is misidentified as
a lepton, are modeled with data containing anti-isolated muons (Iso > 0.2) or candidate
electrons failing quality cuts [14]. The normalization of the Z+jets component is based on the
measured cross section [15], while for $t\bar{t}$, single top, and diboson production the NLO
predicted cross sections are used [16]. The detection efficiencies for Z+jets, $t\bar{t}$,
single top, and diboson contributions are determined from simulation.
The normalization of the multijet QCD component and a preliminary estimation of the W+jets
component are obtained by fitting the $\not\!\!E_T$ spectrum in data to the sum of all
contributing processes.

We perform a combined binned $\chi^2$ fit, for electron and muon events, to the dijet invariant
mass ($M_{jj}$) spectrum using predictions for the multijet QCD, WW, WZ, Z+jets, W+jets,
$t\bar{t}$, and single top processes. The final W+jets normalization is determined by
minimizing this $\chi^2$, and all other contributions are constrained to be within the variance
of their expected normalization.
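
A binned $\chi^2$ with one free normalization and constrained ones can be sketched as follows. This is a minimal illustration, not the fit actually used: the Pearson form of the per-bin variance and the Gaussian penalty terms used to express "constrained within the variance" are assumptions, and all names (`templates`, `norms`, `constraints`) are hypothetical.

```python
def chi2(data, templates, norms, constraints):
    """Binned chi-square with Gaussian penalty terms on constrained yields.

    data        : observed counts per bin
    templates   : dict name -> unit-normalised shape (fraction of events per bin)
    norms       : dict name -> yield being tested for that component
    constraints : dict name -> (expected_yield, sigma); components absent
                  from this dict (e.g. W+jets) float freely
    """
    total = 0.0
    for i, n_obs in enumerate(data):
        n_exp = sum(norms[name] * shape[i] for name, shape in templates.items())
        variance = max(n_exp, 1e-9)  # assumed Pearson variance, guarded against zero
        total += (n_obs - n_exp) ** 2 / variance
    for name, (expected, sigma) in constraints.items():
        total += ((norms[name] - expected) / sigma) ** 2  # assumed penalty form
    return total
```

A minimizer would vary `norms` to reduce this quantity; the penalty terms keep the constrained components close to their expected yields while leaving the unconstrained one free.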

We fit the dijet mass distribution in the range 28-200 GeV/$c^2$ defined a priori in the
measurement of the WW/WZ cross section [4]. Figs. 1(a) and 1(b) show the extrapolation of this
fit in the extended range of mass up to 300 GeV/$c^2$. The fit is stable with respect to
changes in the fit range and histogram binning. Our model describes the data within
uncertainties, except in the mass region ~120-160 GeV/$c^2$, where an excess over the
simulation is seen. The fit $\chi^2$/ndf is 77.1/84, where ndf is the number of degrees of
freedom. The $\chi^2$/ndf computed only in the region 120-160 GeV/$c^2$ is 26.1/20. However,
the Kolmogorov-Smirnov (KS) test, which is more sensitive to a localized excess, yields a
probability of $6 \times 10^{-5}$ [17].

TABLE I: Results of the combined fit. The ratios of the number of events in the excess to the
number of expected diboson events in the electron and muon samples are statistically compatible
with each other.

                                    Electrons      Muons
Excess events                       156 ± 42       97 ± 38
Excess events / expected diboson    0.60 ± 0.18    0.44 ± 0.18
Mean of the Gaussian component          144 ± 5 GeV/$c^2$
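
The compatibility statement in the caption of Table I can be checked with simple arithmetic. The sketch below assumes the two quoted uncertainties are uncorrelated and Gaussian, which the text does not state explicitly.

```python
import math

def pull(a, sigma_a, b, sigma_b):
    """Difference of two measurements in units of their combined uncertainty."""
    return (a - b) / math.hypot(sigma_a, sigma_b)

# Excess / expected diboson: 0.60 +- 0.18 (electrons) vs 0.44 +- 0.18 (muons)
ratio_pull = pull(0.60, 0.18, 0.44, 0.18)
```

Under these assumptions the two ratios differ by well under one standard deviation.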

We try to model the excess with an additional Gaussian peak and perform a $\Delta\chi^2$ test
of this hypothesis. The Gaussian is chosen as the simplest hypothesis compatible with the
assumption of a two jet decay of a narrow resonance with definite mass. The width of the
Gaussian is fixed to the expected dijet mass resolution by scaling the width of the W peak in
the same spectrum:

$\sigma_{\rm resolution} = \sigma_W \sqrt{M_{jj}/M_W} = 14.3$ GeV/$c^2$, where $\sigma_W$ and
$M_W$ are the resolution and the average dijet invariant mass for the hadronic W in the WW
simulations, respectively, and $M_{jj}$ is the dijet mass where the Gaussian template is
centered.
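
The width scaling and the resulting binned Gaussian template can be illustrated as below. The function names are hypothetical, and the values of $\sigma_W$ and $M_W$ are not quoted in the text, so they are left as inputs rather than filled in.

```python
import math

def scaled_resolution(sigma_w, m_w, m_jj):
    """Dijet mass resolution at m_jj, scaled from the hadronic W peak."""
    return sigma_w * math.sqrt(m_jj / m_w)

def gaussian_template(bin_edges, mean, sigma):
    """Fraction of a Gaussian falling in each bin, from differences of its CDF."""
    cdf = [0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))
           for x in bin_edges]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]

# 4 GeV/c^2 bins over the 28-200 GeV/c^2 fit range, with the fitted mean and fixed width
edges = list(range(28, 204, 4))
template = gaussian_template(edges, mean=144.0, sigma=14.3)
```

Integrating the Gaussian over each bin, rather than sampling it at the bin centre, keeps the template normalised regardless of the bin width.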

In the combined fit, the normalization of the Gaussian is free to vary independently for the
electron and muon samples, while the mean is constrained to be the same. The result of this
alternative fit is shown in Figs. 1(c) and 1(d). The inclusion of this additional component
brings the fit into good agreement with the data. The fit $\chi^2$/ndf is 56.7/81 and the
Kolmogorov-Smirnov test returns a probability of 0.05, accounting only for statistical
uncertainties. The W+jets normalization returned by the fit including the additional Gaussian
component is compatible with the preliminary estimation from the $\not\!\!E_T$ fit. The
$\chi^2$/ndf in the region 120-160 GeV/$c^2$ is 10.9/20. The values of the parameters returned
by the combined fit are shown in Table I, where the mean of the Gaussian peak represents the
experimentally measured value, i.e., it is not corrected back to the parton level.

We take the difference between the $\chi^2$ of the two fits ($\Delta\chi^2$), with and without
the additional Gaussian structure, to assess the significance of the excess. The expected
distribution of $\Delta\chi^2$ is computed numerically from simulated background-only
experiments and used to derive the p-value corresponding to the $\Delta\chi^2$ actually
observed. In order to account for the trial factor within our search window,
120-200 GeV/$c^2$, in each pseudoexperiment we calculate the $\Delta\chi^2$ varying the
position of the Gaussian component in steps of 4 GeV/$c^2$. The largest $\Delta\chi^2$ for each
pseudoexperiment is used to define the p-value distribution.
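
The trial-factor procedure can be sketched as follows. This is a schematic outline only: `delta_chi2_at` is a hypothetical stand-in for running the two fits with the Gaussian centred at a given mass, and whether the window end points are both included in the scan is an assumption.

```python
def scan_positions(low=120, high=200, step=4):
    """Gaussian positions scanned inside the search window (end points assumed included)."""
    return list(range(low, high + 1, step))

def max_delta_chi2(delta_chi2_at, positions):
    """Largest delta-chi2 over the scanned positions for one (pseudo)experiment."""
    return max(delta_chi2_at(mass) for mass in positions)

def global_p_value(observed, pseudo_maxima):
    """Fraction of background-only pseudoexperiments at least as extreme as the data."""
    n_extreme = sum(1 for value in pseudo_maxima if value >= observed)
    return n_extreme / len(pseudo_maxima)
```

Taking the maximum over positions in every pseudoexperiment is what absorbs the trial factor: a background fluctuation anywhere in the window counts against the observed excess.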



In deriving the p-value we account for systematic uncertainties that affect the background
shapes and the normalization of constrained components. Normalization uncertainties of
unconstrained components are considered as part of the statistical uncertainty. The largest
systematic uncertainties arise from the modeling of the W+jets and multijet QCD shapes. For
W+jets we consider, as an alternative, the $M_{jj}$ distributions obtained by halving or
doubling the renormalization scale ($Q^2$) in Alpgen. For multijet QCD, we change our model
using different lepton isolation ranges. The systematic uncertainty due to uncertainties in the
jet energy scale (±3%) affects all components with the exception of multijet QCD, which is
derived from data. For each systematic effect we consider the two extreme cases. For each of
the possible combinations of systematic effects we calculate a different $\Delta\chi^2$
distribution and take the conservative approach of using the distribution that returns the
highest p-value. The total systematic effect on the extracted number of excess events, defined
as the number of events fitted by the Gaussian component, in the electron and muon samples is
found to be 10% and 9%, respectively. The dominant systematic effects arise from the W+jets
renormalization scale (6.7%), the jet energy scale (6.1%) and QCD shape (1.9%). Assuming only
background contributions, and systematic errors, the probability to observe an excess larger
than in the data is $7.6 \times 10^{-4}$, corresponding to a significance of 3.2 standard
deviations for a Gaussian distribution. For comparison, the p-value without taking into account
systematic uncertainties is $9.9 \times 10^{-5}$.
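
The conversion from p-value to Gaussian significance can be reproduced with the standard library. The one-sided convention used here is an assumption; it gives values close to the quoted 3.2 standard deviations.

```python
from statistics import NormalDist

def significance(p_value):
    """One-sided Gaussian significance (in standard deviations) for a p-value."""
    return NormalDist().inv_cdf(1.0 - p_value)

z_with_systematics = significance(7.6e-4)  # p-value including systematic uncertainties
z_statistical_only = significance(9.9e-5)  # p-value without systematic uncertainties
```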

To investigate possible mismodeling of the W+jets background we consider various configurations
of our systematic uncertainties. The combination of systematic uncertainties that fits the data
best is shown in Fig. 2(a), where $Q^2$ is doubled and the QCD shape is varied. The KS
probability for this fit is 0.28. The fit $\chi^2$/ndf outside the 120-160 GeV/$c^2$ region is
50.3/66, indicating that the dijet mass distribution is well modeled within our systematic
uncertainties. This choice of systematic uncertainties returns a p-value intermediate between
the central configuration and the most conservative combination. In order to test
next-to-leading-order contributions to the W+2 partons prediction, we compare a sample of
W+2 partons simulated with Alpgen and interfaced to Pythia for showering to a sample of
W+2 partons simulated using the MCFM generator [18]. We extract a correction as a function of
$M_{jj}$ that is applied to the Alpgen + Pythia sample used in our background model. The
statistical significance obtained with the MCFM reweighted W+jets background model is
3.4$\sigma$.

Details of a large set of additional checks can be found in Ref. [14]. In particular we
verified that the background model describes the data in several independent control regions
and satisfactorily reproduces the kinematic distributions of jets, lepton, and $\not\!\!E_T$.
The excess is stable against 5 GeV variations of the thresholds used for all of the kinematic
selection variables, including variations of the jet $E_T > 30$ GeV threshold. This analysis
employs requirements on jets of $E_T > 30$ GeV and $p_T > 40$ GeV/$c$ for the dijet system,
which improves the overall modeling of many kinematic distributions. We also test a selection
only requiring jet $E_T > 20$ GeV as in Ref. [IS]. This selection, which increases the
background by a factor of 4, reduces the statistical significance of the excess to about
1$\sigma$.

FIG. 1: The dijet invariant mass distribution. The sum of electron and muon events is plotted.
In the left plots we show the fits for known processes only (a) and with the addition of a
hypothetical Gaussian component (c). In the right plots we show, by subtraction, only the
resonant contribution to $M_{jj}$ including WW and WZ production (b) and the hypothesized
narrow Gaussian contribution (d). In plots (b) and (d) the data points differ because the
normalization of the background changes between the two fits. The band in the subtracted plots
represents the sum of all background shape systematic uncertainties described in the text. The
distributions are shown with an 8 GeV/$c^2$ binning while the actual fit is performed using a
4 GeV/$c^2$ bin size.



FIG. 2: (a) The dijet invariant mass distribution for the sum of electron and muon events,
shown after subtraction of fitted background components with the exception of the resonant
contribution to $M_{jj}$ including WW and WZ production and the hypothesized narrow Gaussian
contribution. With respect to Figure 1, the subtracted background components are chosen as the
systematic combination that best fits the data (see text). The fit $\chi^2$/ndf is 62.0/81.
(b) $\Delta R_{jj}$ distribution for events with $M_{jj} < 115$ or $M_{jj} > 175$ GeV/$c^2$ of
the data compared to the background estimation that corresponds to the same systematic
combination as in (a). The uncertainty band corresponds to the background statistical
uncertainty.

We study the $\Delta R_{jj}$ distribution to investigate possible effects that could result in
a mismodeling of the dijet invariant mass distribution. We consider two control regions, the
first defined by events with $M_{jj} < 115$ or $M_{jj} > 175$ GeV/$c^2$ and the second defined
by events with dijet $p_T < 40$ GeV/$c$. We use these regions to derive a correction as a
function of $\Delta R_{jj}$ to reweight the events in the excess region. We find that the
reweightings change the statistical significance of the result by plus or minus one sigma.
However, the $\Delta R_{jj}$ distribution is strongly correlated to $M_{jj}$ and the control
regions both have significantly different distributions of $\Delta R_{jj}$. Reweighting our
W+jets sample to correct for the differences observed in $\Delta R_{jj}$ in the control samples
may be indicative of the effect of correcting $\Delta R_{jj}$ mismodeling or may introduce bias
in the $M_{jj}$ distribution. In addition, the $\Delta R_{jj}$ distribution is consistent
within the one sigma variation of the systematic uncertainties for events outside the excess
mass region as shown in Fig. 2(b). The data-background comparison of the $\Delta R_{jj}$
distribution has $\chi^2$/ndf of 26.7/18 and a KS probability of 0.022 when compared with the
best-fit systematic model. For these reasons, we present these studies as cross checks and
quote the significance in the unweighted sample as our primary result.
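
A bin-by-bin reweighting of the kind described above can be sketched as follows. The histogram layout and function names are hypothetical, and the choice to leave events in empty or out-of-range bins unweighted is an assumption made for the illustration.

```python
from bisect import bisect_right

def bin_index(value, edges):
    """Index of the bin containing value, or None if outside the histogram range."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1

def derive_weights(data_counts, model_counts):
    """Data/model ratio per Delta R_jj bin, measured in a control region."""
    return [d / m if m > 0 else 1.0 for d, m in zip(data_counts, model_counts)]

def event_weights(delta_r_values, edges, weights):
    """Weight for each simulated event, looked up from its Delta R_jj value."""
    result = []
    for value in delta_r_values:
        index = bin_index(value, edges)
        result.append(weights[index] if index is not None else 1.0)
    return result
```

Because $\Delta R_{jj}$ and $M_{jj}$ are strongly correlated, weights derived this way also reshape the mass spectrum, which is the source of the possible bias noted in the text.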

We look for evidence for or against the hypothesis that the excess in the 120-160 GeV/$c^2$
mass range is from a new (non-SM) physics source. Since non-SM particles may in general couple
to both massive electroweak gauge bosons, we have investigated the shape of the dijet mass
distribution in Z+jets events. In this sample the number of events in the data is approximately
a factor of 15 less than in the W+jets sample and no statistically significant deviation from
the SM expectation is observed. We increase the jet $E_T$ threshold in steps of 5 GeV and check
the fraction of excess events that are selected as a function of the jet $E_T$. The result is
compatible with expectation from a Monte Carlo simulation of a W boson plus a particle with a
mass of 150 GeV/$c^2$ and decaying into two jets [T3]. In this model, we estimate a cross
section times the particle branching ratio into dijets of the order of 4 pb. The cross section
of the observed excess is not compatible with SM WH production whose
$\sigma \cdot \mathrm{BR}(H \to b\bar{b})$ is about 12 fb for $m_H = 150$ GeV/$c^2$ [20]. To
check the flavor content with this selection, we identify jets originating from a b-quark by
requesting a displaced secondary vertex for tracks within the jet cone. We compare the fraction
of events with at least one b-jet in the excess region (120-160 GeV/$c^2$) to that in the
sideband regions (100-120 and 160-180 GeV/$c^2$), and find them to be compatible with each
other. Dedicated CDF searches for $WH \to \ell\nu b\bar{b}$ using events with reconstructed
displaced vertices from b hadron decay, and looser selection criteria, have not found any
significant excesses using final analysis discriminants trained to identify Higgs bosons in the
mass range 100-150 GeV/$c^2$ [TO].
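
The size of the mismatch with SM WH production follows from a unit conversion. The sketch below uses only the two rounded numbers quoted in the text, so the result is an order-of-magnitude figure.

```python
PB_TO_FB = 1000.0  # 1 pb = 1000 fb

def cross_section_ratio(excess_pb, wh_fb):
    """Ratio of the estimated excess cross section to the SM WH expectation."""
    return excess_pb * PB_TO_FB / wh_fb

ratio = cross_section_ratio(4.0, 12.0)  # "of the order of 4 pb" vs "about 12 fb"
```

With these inputs the excess would be a few hundred times larger than the SM WH rate.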

Finally, to investigate the possibilities of a parent resonance or other quasi-resonant
behavior, we consider the $M(\mathrm{lepton},\nu,jj)$ and the
$M(\mathrm{lepton},\nu,jj) - M_{jj}$ [21] distributions for events with $M_{jj}$ in the range
120-160 GeV/$c^2$ and, to investigate the Dalitz structure of the excess events, the
distribution of $M(\mathrm{lepton},\nu,jj) - M_{jj}$ in bins of $M_{jj}$. The distributions are
compatible in shape with the background-only hypothesis in all cases.

In conclusion, we study the invariant mass distribution of jet pairs produced in association
with a W boson. The best fit to the observed dijet mass distribution using known components,
and modeling the dominant W+jets background using Alpgen+Pythia Monte Carlo, shows a
statistically significant disagreement. One possible way to interpret this disagreement is as
an excess in the 120-160 GeV/$c^2$ mass range. If we model the excess as a Gaussian component
with a width compatible with the dijet invariant mass resolution, and perform a $\Delta\chi^2$
test for the presence of this additional component, we obtain a p-value of
$7.6 \times 10^{-4}$, corresponding to a significance of 3.2 standard deviations, after
accounting for all statistical and systematic uncertainties.